
Mobile DNA

Springer Science and Business Media LLC

Preprints posted in the last 90 days, ranked by how well they match Mobile DNA's content profile, based on 27 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.

1
Hide and seek: de novo identification in sugar beet reveals impact of non-autonomous LTR retrotransposons

Maiwald, S.; Maiwald, F.; Heitkam, T.

2026-03-03 genomics 10.64898/2026.03.01.708851 medRxiv
Top 0.1%
52.3%

Plant genomes are filled with retrotransposons and their derivatives, subject to constant sequence turnover. As short, non-autonomous retrotransposons do not encode a protein product, they experience reduced selective constraints on their DNA sequence, leading to diversification into multiple families, usually limited to only a few species. This absence of any coding capacity and their tendency to form subfamilies are the reasons for the incomplete description of non-autonomous LTR retrotransposons in most, if not all, genomic repeat annotations. Here, we focus on non-autonomous LTR retrotransposon identification. Are all of these sequences derivatives of easier-to-identify full-length elements? Or is there more variability, which is currently overlooked? For this, we capitalize on our comprehensive understanding of the TE landscape in sugar beet to assess the extent of the blind spot on non-autonomous LTR retrotransposons. Here, we present a workflow to identify non-autonomous LTR retrotransposons without prior sequence information, retrieving more than 100 families within the sugar beet genome. We only include TEs without the ability for complete self-mobilization. Spanning up to 15,000 bp, these non-autonomous families are often longer than expected and characterized by reshuffling and modular evolution. Most strikingly, only a few of these families are directly derived from autonomous partners, showing that there is a large, undiscovered TE variety in the non-autonomous TE fraction. We highlight that a large fraction of non-autonomous TEs will not be retrieved with the current TE identification workflows, even if the output is well-curated and condensed into TE libraries, and suggest procedures to remedy this gap. This study is the first insight into the non-autonomous LTR retrotransposon landscape within a single genome and serves as an example to estimate the error in non-autonomous TE detection.

2
ATHILAfinder: a tool to detect ATHILA LTR retrotransposons in plant genomes

Bousios, A.; Primetis, E.

2026-03-22 bioinformatics 10.64898/2026.03.20.713144 medRxiv
Top 0.1%
22.8%

Motivation: The ATHILA lineage of LTR retrotransposons has colonised all branches of the plant tree of life. In Arabidopsis thaliana and A. lyrata, ATHILA elements have invaded centromeres, influencing the genetic and epigenetic organisation, and driving satellite evolution. To assess the broader significance of ATHILA across plants, a computational pipeline is needed to identify ATHILA elements with high efficiency. Existing tools lack this ability because they are optimised for broad transposon classification at the expense of precise annotation of lower taxonomic levels. Results: We present ATHILAfinder, a pipeline for accurate and large-scale discovery of ATHILA elements. ATHILAfinder uses lineage-specific sequence motifs as seeds and additional filters to build de novo intact elements. Homology-based steps rescue intact ATHILA and identify soloLTRs. A detailed identity card includes coordinates, LTR identity, coding capacity, length and other sequence features for every ATHILA. We validate ATHILAfinder in the A. thaliana Col-CEN assembly and five additional Brassicaceae species, covering four supertribes and ~30 million years of evolution. ATHILAfinder has very low false positive rates and outperforms widely-used tools like EDTA and the deep-learning-based Inpactor2 software for both recovery and precision of ATHILA. To demonstrate its usefulness, we generate insights into ATHILA dynamics across Brassicaceae. Outlook: Few computational pipelines target specific transposon lineages, yet such tools can empower their identification and downstream analyses. Our tailored approach can be adapted to other LTR retrotransposon lineages, offering new ways for high-resolution analysis of transposons.

3
Biological implications of a detailed repeat annotation in Octopus vulgaris

Bonar, M.; Elliot, T. A.; Ahmadi, M. A.; Cottenie, K.; Linquist, S.

2026-03-05 genomics 10.64898/2026.03.03.709284 medRxiv
Top 0.1%
22.2%

Octopuses are phenotypically distinctive organisms, and recent genomic work raises questions about the contributions of transposable elements (TEs) to their genomic architecture. We leveraged a robust repeat annotation pipeline, in combination with manual and automated curatorial techniques, to produce a more comprehensive repeat annotation of Octopus vulgaris. This revealed that ~66% of the genome consists of repeats, in contrast to previous estimates of 43-50%. Whereas previous studies of TE expansion in Octopus bimaculoides identified two bursts of activity, 25 and 56 MYA, our re-annotation revealed four such expansions at 18, 25, 33, and 56 MYA. We further identified a landscape of TE hot- and cold spots. This much refined TE timescape and landscape will serve as a useful basis for understanding TE contributions to O. vulgaris evolution, and also for identifying factors contributing to variation in the TE community across genomic space and evolutionary time.

4
Benchmarking computational tools for locus-specific analysis of transposable elements in single-cell RNA-seq datasets

Finazzi, V.; Vallejos, C. A.; Scialdone, A.

2026-02-28 bioinformatics 10.64898/2026.02.26.708244 medRxiv
Top 0.1%
14.3%

Background: Transposable elements (TEs) are increasingly recognized as regulators of gene expression and cellular identity in development and disease. Single-cell RNA-sequencing (scRNA-seq) enables the analysis of their transcription at cellular resolution, but the repetitive nature of TEs and their frequent overlap with genes create substantial mapping ambiguity. Although several tools quantify TE expression, few support locus-specific analysis, and their performance in single-cell data has not been systematically evaluated. Results: We present a comprehensive benchmarking framework for locus-level TE quantification in short-read scRNA-seq, combining real datasets with simulations that provide read-level ground truth. TE-derived reads constitute a considerable fraction of the transcriptome and capture meaningful biological structure. Our simulations reveal that older, sequence-diverged insertions can be quantified with relatively high accuracy, whereas young TEs remain intrinsically difficult to resolve due to unreliable assignment of multi-mapping reads. We observe pronounced family-specific biases and identify gene-TE disambiguation as a major unresolved challenge. Among evaluated methods, SoloTE (unique-mapper mode) and Stellarscope (with an expectation-maximization-based reallocation of multi-mappers) showed comparable performance, while including multi-mappers generally increased false positives without substantially improving locus-level accuracy. Conclusions: Our benchmark delineates the fundamental limits imposed by short-read scRNA-seq on locus-specific TE quantification, providing practical guidance for prospective users. Suggested best practices include focusing locus-level analyses on older insertions, applying unique-mapper strategies to improve precision, aggregating counts at the subfamily level for young TEs, and explicitly checking for gene-TE overlaps.
Our workflow is fully reproducible and extensible, providing a foundation for evaluating emerging methods aimed at resolving TE transcription at single-locus resolution.
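
Two of the practices suggested in this abstract (counting only unique mappers at the locus level, and aggregating at the subfamily level) can be illustrated with a minimal sketch. It assumes a hypothetical in-memory format in which each read carries the list of TE loci it aligns to. The function names, read IDs, and locus labels are illustrative only and are not taken from SoloTE, Stellarscope, or the preprint's workflow.

```python
from collections import Counter

def count_unique_mappers(read_to_loci):
    """Count reads per locus, keeping only reads that align to exactly one locus."""
    counts = Counter()
    for loci in read_to_loci.values():
        if len(loci) == 1:
            counts[loci[0]] += 1
    return counts

def aggregate_by_subfamily(read_to_loci, locus_to_subfamily):
    """Count reads per subfamily.

    A multi-mapping read is kept only when all of its candidate loci belong
    to the same subfamily, so the subfamily assignment is unambiguous.
    """
    counts = Counter()
    for loci in read_to_loci.values():
        subfamilies = {locus_to_subfamily[locus] for locus in loci}
        if len(subfamilies) == 1:
            counts[subfamilies.pop()] += 1
    return counts

# Hypothetical toy data: four reads, three TE loci, two subfamilies.
reads = {
    "r1": ["L1HS_chr1_a"],                 # unique mapper
    "r2": ["L1HS_chr1_a", "L1HS_chr5_b"],  # multi-mapper within one subfamily
    "r3": ["AluY_chr2_c"],                 # unique mapper
    "r4": ["L1HS_chr5_b", "AluY_chr2_c"],  # ambiguous across subfamilies
}
locus_to_subfamily = {
    "L1HS_chr1_a": "L1HS",
    "L1HS_chr5_b": "L1HS",
    "AluY_chr2_c": "AluY",
}

locus_counts = count_unique_mappers(reads)
subfamily_counts = aggregate_by_subfamily(reads, locus_to_subfamily)
```

In this toy case, read `r2` is discarded at the locus level but still contributes to the L1HS subfamily count, which is the kind of trade-off between locus resolution and read recovery that the abstract describes for young TEs.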

5
Population-scale discovery and analysis of non-reference endogenous retrovirus insertions in wild house mice

Yano, T.; Takada, T.; Fujiwara, K.; Watabe, D.; Hirose, S.; Masuya, H.; Endo, T.; Osada, N.

2026-02-20 evolutionary biology 10.1101/2025.09.23.678169 medRxiv
Top 0.1%
12.2%

Endogenous retroviruses (ERVs) represent a major source of structural variation in mammalian genomes, yet their diversity in wild populations remains poorly understood. Here, we conduct a comprehensive genome-wide survey of non-reference ERV insertions in wild house mice (Mus musculus) to characterize their distribution and evolutionary dynamics. Using a newly developed bioinformatics pipeline, we detected and annotated over 100,000 non-reference ERV insertions from short-read sequencing data across 163 wild mouse genomes. Our analyses revealed marked differences in ERV insertion patterns among subspecies and populations, including variation in genomic localization and population-specific polymorphisms. These heterogeneous patterns suggest distinct evolutionary histories and host-retrovirus interactions across populations. For instance, we describe the distribution of the ERV-derived Fv4 locus, which shows subspecies-restricted occurrence and confers resistance to murine leukemia viruses (MLVs). Several lines of evidence showed that the spread of Fv4 insertions in the Korean population has been driven by adaptive introgression from neighboring populations. Our study provides the first large-scale population genomic scan of ERV diversity in wild house mice. By cataloguing extensive polymorphism in non-reference ERV insertions, our results highlight the role of ERVs as dynamic genomic elements that contribute to structural variation and adaptive evolution. Article Summary: Endogenous retroviruses (ERVs) are viral sequences embedded in animal genomes that can create structural genetic variation. In this study, we conducted a genome-wide survey of non-reference ERV insertions in 163 wild house mice using short-read sequencing data and a newly developed computational pipeline. We identified more than 100,000 polymorphic ERV insertions and found substantial differences among subspecies and geographic populations.
One example, the ERV-derived Fv4 locus, illustrates how ERV variation can influence the genetic pattern of polymorphisms in the species. These results demonstrate that ERVs are dynamic genomic elements that contribute to population divergence and adaptive evolution.

6
An Alternative DNA Endonuclease Activity is Associated with the LINE-1 ORF2-encoded Protein

Nakamura, M.; Kopera, H. C.; Dowling, M.; Barabas, O.; Moran, J. V.

2026-01-20 molecular biology 10.64898/2026.01.20.700639 medRxiv
Top 0.1%
10.5%

Long INterspersed Element-1 (L1) retrotransposons use activities contained within the L1 open reading frame 2-encoded protein (ORF2p) to mobilize throughout the genome via target-site primed reverse transcription (TPRT). The ORF2p endonuclease domain (EN) cleaves genomic DNA to liberate a 3'-hydroxyl (3'-OH) group that is used by the ORF2p reverse transcriptase domain (RT) to synthesize a cDNA copy of its bound L1 RNA template. L1 also can move by EN-independent retrotransposition (ENi), where a 3'-OH group at genomic DNA lesions, dysfunctional telomeres, or stalled replication forks is proposed to prime L1 reverse transcription in the absence of L1 EN cleavage. We previously reported that ribonucleoprotein (RNP) preparations from cells transfected with a human wild-type (WT) L1 or L1 EN-mutant, but not an L1 RT-mutant, can initiate reverse transcription from a DNA oligonucleotide primer/L1 RNA template complex. The WT and EN-deficient L1 RNP preparations also are associated with a nuclease activity that can process a 3' end modification that precludes DNA synthesis from an oligonucleotide prior to priming the L1 RT reaction. Here, we purified recombinant full-length WT, L1 EN-, and L1 RT-mutant human L1 ORF2p from insect cells. We report that the WT and L1 EN-mutant, but not the L1 RT-mutant, contain an alternative endonuclease activity (alt-EN). Alt-EN activity also is detected in a bacterially expressed L1 ORF2p protein that lacks the L1 EN and ORF2p cysteine-rich domains and a thermostable group II intron-encoded protein. Processing of diverse modified primers demonstrates endonucleolytic cleavage that is eliminated by mutations in the RT active site. We propose that alt-EN is an evolutionarily conserved activity within the RT fold that promoted ENi retrotransposition of primordial retrotransposons prior to the acquisition of an EN domain.

7
TEExplorer: A Web Portal to Investigate TE-Epigenome Associations Across Human Cell Types

Hyacinthe, J.; Lougheed, D. R.; Bourque, G.

2026-02-19 genomics 10.64898/2026.02.18.706470 medRxiv
Top 0.1%
8.3%

Approximately half of the human genome is derived from transposable elements (TEs) and several studies support the involvement of TEs in genome regulation in development, immunity and disease. We previously leveraged 4614 ChIP-seq samples from the International Human Epigenome Consortium (IHEC) EpiATLAS dataset and performed a comprehensive analysis of the relationship between TEs and 6 histone marks across 57 human cell types. However, with over 6 million measurements of TE / histone mark / cell type enrichment, it was challenging to navigate the results and it was not possible to integrate them with user data. To address this, we developed a web tool, TEExplorer, which makes available TE overlaps and enrichments in an accessible and intuitive manner. The tool presents an interactive view of TE families and subfamilies, with their overlap and enrichments across histone marks and cell types. Finally, the tool allows users to upload their own ChIP-seq BED file to obtain the TE overlap and enrichment relative to random controls and compare their data with the EpiATLAS dataset. With TEExplorer, researchers with an interest in a particular TE family or subfamily, histone mark, or cell type, or those bringing their own ChIP-seq dataset, can dynamically explore and contrast hundreds of associations found within the large EpiATLAS dataset. Availability: Online portal: https://teexplorer.c3g.sd4h.ca
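
The idea of "enrichment relative to random controls" mentioned in this abstract can be illustrated with a minimal sketch. It is not TEExplorer's implementation: the single-chromosome model, the uniform random placement of peaks, the ratio-based score, and all names and coordinates below are assumptions made for illustration.

```python
import random

def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]

def count_overlapping(peaks, tes):
    """Number of peaks that overlap at least one TE interval."""
    return sum(any(overlaps(p, t) for t in tes) for p in peaks)

def random_control(peaks, chrom_length, rng):
    """Place each peak at a uniformly random position, preserving its length."""
    control = []
    for start, end in peaks:
        length = end - start
        new_start = rng.randrange(0, chrom_length - length + 1)
        control.append((new_start, new_start + length))
    return control

def enrichment(peaks, tes, chrom_length, n_controls=200, seed=0):
    """Observed overlap count divided by the mean count over random controls."""
    rng = random.Random(seed)
    observed = count_overlapping(peaks, tes)
    expected = sum(
        count_overlapping(random_control(peaks, chrom_length, rng), tes)
        for _ in range(n_controls)
    ) / n_controls
    return observed / expected if expected > 0 else float("inf")

# Hypothetical toy data on one 100 kb chromosome.
tes = [(1_000, 2_000), (50_000, 51_000)]
peaks = [(1_200, 1_400), (50_500, 50_700), (80_000, 80_200)]

observed = count_overlapping(peaks, tes)
score = enrichment(peaks, tes, chrom_length=100_000)
```

A score above 1 indicates that the peaks overlap TEs more often than randomly placed intervals of the same lengths. A real analysis would typically work per chromosome on BED files and respect mappability and gap regions when drawing controls.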

8
Evaluating the reliability of tools for mRNA annotation and IRES studies

May, G. E.; Akirtava, C.; McManus, J.

2026-03-31 genomics 10.64898/2026.03.29.707813 medRxiv
Top 0.1%
6.4%

Since the discovery of viral Internal Ribosome Entry Sites (IRESes), researchers have sought to find similar elements in mammalian host genes, termed "cellular IRESes". However, the plasmid systems used to measure cellular IRES activity are vulnerable to false positives due to promoter activity in candidate IRESes. Orthogonal methods are needed to validate putative IRESes while carefully avoiding artifacts known to cause false positives. Recently, Koch et al. proposed approaches for studying IRESes, primarily circular RNA-generating plasmids, and for validating mRNA transcripts using smFISH and qRT-PCR. Here, we demonstrate confounding variables and artifacts in each of these approaches that can lead to inappropriate conclusions about potential cellular IRES activity. We show the back-splicing circRNA plasmid creates linear mRNA artifacts associated with false-positive IRES signals. Using orthogonal, gold-standard assays validated with viral IRESes, we find putative cellular IRESes reported using the back-splicing plasmid have no IRES activity. Furthermore, we demonstrate that smFISH and qRT-PCR can misidentify nuclear non-coding RNAs as mRNAs and we validate a single-molecule sequencing assay for identifying genuine mRNA 5' ends. Our work establishes reliable methods for robust transcript annotation and IRES studies that avoid documented artifacts arising from bicistronic and back-splicing circRNA plasmid reporters.

9
Conservation of Long G4-rich (LG4) genomic enhancer regulations

Shaw, M. H.; DeMeis, J. D.; Arnold, C. A.; Cox, M. R.; Duong, T. C.; Gaviria, K. A.; McDavid, G. K.; Villegas, J. M.; Weimer, M. L.; Patil, S. S.; Alqudah, S. Y.; Borchert, G. M.

2026-03-13 genomics 10.64898/2026.03.11.711068 medRxiv
Top 0.1%
6.1%

Long G4-rich regions (LG4s) are defined as DNA sequences containing a high density of guanine triplets capable of forming non-B DNA structures called G-quadruplexes (G4s). These regions frequently overlap with enhancers, which are regulatory DNA elements that modulate gene expression by interacting with DNA regions that dictate where transcription is initiated known as promoters. While LG4s have now been well-characterized in the human genome, neither LG4 occurrence, nor the ability of LG4s to function as enhancers, in other species has been described. To address this, we screened the genomes of 16 different species from various taxa to identify LG4s and then determined if they were conserved, and if so, if their regulatory capacity was similarly conserved. Our analyses characterized a number of previously unreported LG4s in the human genome as well as LG4s in 13 additional species. Of note, we identified a highly conserved LG4 enhancer predicted to regulate over 40 genes. This LG4 is embedded in the MAZ (Myc-Associated Zinc finger protein) locus, and we find this LG4 possesses the ability to directly interact with the same target promoter in both human and mouse. In summary, this work describes LG4s in the genomes of both unicellular and multicellular species including vertebrates, invertebrates, plants, and fungi. Furthermore, many of these LG4 sequences are highly conserved as is their regulatory capacity.

10
lncOriL, a novel polyadenylated mitochondrial lncRNA common to zebrafish and human

Jorgensen, T. E.; Wardale, A.; Wolf Profant, S.; Amundsen, C.; Emblem, A.; Joakimsen, I. S.; Brekke, O.-L.; Karlsen, B. O.; Babiak, I.; Johansen, S. D.

2026-03-27 molecular biology 10.64898/2026.03.26.714394 medRxiv
Top 0.1%
4.0%

Even though teleost fish and mammals share the same mitochondrial gene content and organization, the teleost mitochondrial transcriptome is still poorly understood. We characterized the mitochondrial transcriptome during zebrafish (Danio rerio) early development by long-read direct RNA sequencing. All heavy-strand specific mRNAs were found to carry 3' poly-A tails of approximately 50-60 residues, and the transcriptome profile was distinctive but practically invariant between stages. Three unusual transcripts were however noted. These included two mRNAs (COI and ND5 mRNAs), with significant 3' untranslated regions corresponding to antisense gene sequences, and a previously undescribed noncoding RNA named here lncOriL. The ND5 mRNA was found to carry one third of all detected m6A methylation sites in the zebrafish mitochondrial transcriptome. The 313 nt-long lncOriL transcript had an abundance comparable to that of ND5 mRNA and it mapped to a mitochondrial genome region covering the origin of light strand replication and four flanking antisense tRNAs. A mitochondrial tRNA-derived fragment (tiRNA5-Asn), with a 35 nt perfect pairing-potential to lncOriL, was present at all stages. Additional analyses including adult zebrafish, scissortail (Rasbora rasbora), and monkfish (Lophius piscatorius) strongly corroborate the results of COI mRNA, ND5 mRNA, and lncOriL transcript prevalence among teleost fish. Surprisingly, our findings in zebrafish were further supported by mitochondrial transcriptome analyses in domestic pig (Sus scrofa) and human (Homo sapiens), including tiRNA5-Asn commonly present in human tissues, suggesting that lncOriL is ubiquitously expressed and regulated in vertebrates. Author Summary: Mitochondria contain their own genome and produce essential RNAs needed for energy production. Although fish and mammals share the same mitochondrial gene organization, less is known about how mitochondrial RNAs are processed and regulated in teleosts.
Using Nanopore direct RNA sequencing, we examined mitochondrial RNAs during early zebrafish development and discovered three unusual transcripts that include extended non-coding regions. Two of these molecules, COI and ND5 mRNAs, carry long 3' untranslated regions formed by antisense gene sequences, suggesting previously unrecognized regulatory potential. We also identified lncOriL, a highly structured long noncoding RNA that spans the origin of light-strand replication and is abundant during development. Strikingly, the same RNA feature, including lncOriL and a matching tRNA-derived small RNA (tiRNA5-Asn), was found not only in zebrafish but also in human mitochondrial transcriptomes. These findings support conservation of regulatory mitochondrial RNAs across main groups of vertebrate species. Our work reveals a new layer of mitochondrial RNA regulation and expands the current understanding of how mitochondrial gene expression is controlled.

11
Tn3-derived inverted-repeat miniature elements (TIMEs) that mobilize antibiotic resistance genes

Gomi, R.; Yano, H.

2026-02-25 microbiology 10.1101/2025.11.05.686661 medRxiv
Top 0.1%
3.7%

Miniature inverted-repeat transposable elements (MITEs) are nonautonomous mobile genetic elements (MGEs) that can be mobilized by transposases provided by the relevant autonomous MGEs. MITEs originating from Tn3-family transposons were previously termed Tn3-derived inverted-repeat miniature elements (TIMEs). Composite transposon-like structures bounded by two copies of TIME, called TIME-COMPs, were shown to mobilize the intervening sequences. However, their association with antibiotic resistance genes (ARGs) has not yet been systematically studied. This study thus aimed to identify new TIME-COMP-like structures containing ARGs in the genomic sequences of the clinically important bacterial family Enterobacteriaceae in public databases. TIME-COMP-like structures were first searched for in the plasmid database PLSDB, focusing on small plasmids, using a self-against-self blastn approach to identify repeated elements. Then, newly and previously identified MITEs (including TIMEs) were searched for in the NCBI core nucleotide database to identify TIME-COMP-like structures located on other replicons. Bioinformatic analysis identified multiple previously unreported TIME-COMPs containing ARGs, which are bounded by directly or inversely oriented TIMEs, namely, IS101, MITESen1, and a novel 244-bp TIME termed TIME244. TIME244 contains a putative resolution site related to that of Tn21. These TIMEs were predominantly detected in plasmids and very rarely in chromosomes. The ARGs embedded in newly identified TIME-COMPs were blaKPC-2, floR, qnrS1, and tet(A). Notably, the blaKPC-2 carbapenemase gene was found in TIME-COMPs bounded by TIME244 and a TIME-COMP bounded by IS101. These findings highlight a potential role for TIMEs in the spread of diverse ARGs. 
IMPACT STATEMENT: Bacterial miniature inverted-repeat transposable elements (MITEs) are a group of short (50 bp-500 bp) nonautonomous transposable elements that are thought to have originated from insertion sequences or transposons. Although MITEs can theoretically mobilize antibiotic resistance genes (ARGs) in the presence of transposases, only a few studies have reported their association with ARGs, probably due to difficulties in identifying MITEs in genomic sequences. This study provides evidence, based on bioinformatic analysis of public Enterobacteriaceae genomes, that a subset of MITEs, called Tn3-derived inverted-repeat miniature elements (TIMEs), mobilizes ARGs by forming composite transposon-like structures. A novel 244-bp TIME, designated TIME244, was present in more than 100 Enterobacteriaceae plasmids in the current RefSeq database, suggesting its further transmission in bacterial populations through horizontal gene transfer. This study reveals that TIMEs were often overlooked when analyzing the genetic contexts of ARGs in previous studies. These findings highlight the importance of TIMEs in bacterial gene acquisition and underscore the need for new tools that can detect TIMEs in bacterial genomes for ARG surveillance. DATA SUMMARY: Accession numbers of sequence data analyzed in this study are provided within the article or in supplementary data files.

12
Rapid evolution and comparative analysis of piRNA clusters in D. simulans

Narayanan, P.; Srivastav, S.; Signor, S.

2026-01-20 evolutionary biology 10.64898/2026.01.19.700409 medRxiv
Top 0.1%
3.6%

Eukaryotic genomes are ubiquitously occupied by mobile genetic elements termed transposons, which are silenced via a specialized class of small RNA called piRNA. The small RNA is produced from the transposons themselves when they occupy specialized regions of the genome termed piRNA clusters. The formation of these specialized regions, or their evolution over time, is not well understood. Recent work has suggested that they are extremely variable even within a single species such as Drosophila melanogaster. We were interested in taking a comparative approach to piRNA cluster evolution to ask the question - what processes are unique to D. melanogaster and which are shared? Shared phenomena are more likely to be fundamental aspects of piRNA formation and evolution compared to those that are more labile. Using five high-quality long-read genome assemblies and five genotype-specific piRNA libraries, we approach this question from a population genetics standpoint. We annotate piRNA clusters, transposons, and structural variants in each of these five genomes. We found extensive variation in piRNA clusters across strains, with smaller piRNA clusters more likely to be limited to a single genotype. By and large, our results are consistent with a model of piRNA cluster evolution in which piRNA clusters are rapidly formed and lost, with a small subset increasing in frequency and length over time. However, we find that the TEs which nucleate the formation of small piRNA clusters are entirely distinct in D. simulans compared to D. melanogaster, and likely reflect its invasion history rather than any inherent property of the transposon to nucleate clusters. Therefore, while large common clusters can act as traps as has been posited for piRNA clusters, there are also numerous small clusters that are born and lost rapidly within a species.

13
A mdg4 Retrotransposon Screen for X-linked Female Sterile Alleles and its Relationship with the Transcription Factor OVO

Benner, L.; Oliver, B. C.

2026-02-14 genetics 10.64898/2026.02.12.705638 medRxiv
Top 0.1%
3.6%

In the germline, the mdg4 retrotransposon integrates in close proximity to the location of OVO DNA binding motifs, suggesting that insertion bias is driven by the OVO transcription factor. A classical genetic example of this is the reversion of the dominant female-sterile allele, ovoD1, by the transposition of mdg4 into the ovo promoter where OVO protein binds. We wanted to take advantage of this relationship and determine if we could recover female sterile alleles along the X chromosome due to mdg4 insertion, with the hypothesis that these would be genes that OVO binds and transcriptionally regulates in the germline. We mobilized the mdg4 retrotransposon with the use of mutants for the lncRNA gene flamenco (flam) and recovered 17 recessive female sterile alleles out of a total of 1,192 chromosomes screened. We identified 11 complementation groups, for which a mdg4 insertion was responsible for female sterility in 7 groups. Notably, a complementation group consisting of 6 alleles was found to be the result of a Doc transposable element insertion into the gene Grip91 and is potentially evidence for a Doc insertional hotspot in the genome. Our screen also uncovered that 7/17 recessive female sterile chromosomes contained multiple transposable element insertions, indicating that flam- females derepress numerous transposable elements that can lead to multiple transposon insertions along a single chromosome, as has been suggested previously. Altogether, we found that mdg4 did have an insertion bias into OVO-bound regions of the genome that can result in female sterility; however, this was the case for a minority of the female sterile alleles recovered with this method. Article Summary: The retrotransposon mdg4 preferentially inserts near binding sites of the female germline transcription factor OVO in Drosophila melanogaster, most notably at the ovo locus itself.
We leveraged this relationship to screen for X-linked recessive female-sterile mutations generated by mdg4 mobilization in flamenco mutant females. From 1,192 chromosomes, we recovered 17 female-sterile alleles across 11 complementation groups. mdg4 insertions were significantly enriched in OVO-bound regions but accounted for only a subset of sterility phenotypes, revealing substantial background mutagenesis by other transposable elements. These results refine the OVO-mdg4 relationship and highlight both the promise and limitations of transposon-based genetic screens.

14
IAP retrotransposons contribute to the transcriptional diversity of the murine placenta

Amante, S. M.; Vignola, M. L.; Pulver, C.; Bertozzi, T. M.; Ferguson-Smith, A. C.; Charalambous, M.; Branco, M. R.

2026-01-30 genetics 10.64898/2026.01.27.702056 medRxiv
Top 0.1%
3.4%

Transposable elements (TEs) have made important contributions to the evolution of the placenta, and are argued to have played a role in the wide inter-species diversification of this critical developmental organ. Co-option of TEs by host genomes has led to the genesis of important placental genes, as well as trophoblast-specific gene regulatory elements. In mice, past work has demonstrated how multiple species-specific TE subfamilies are used as transcriptional enhancers in trophoblast stem cells. However, the involvement of TEs in the regulation of mouse placental gene expression in vivo remains unclear. Here, we characterised the TE regulatory and transcriptional landscape in mouse placenta and gauged their evolutionary dynamics through a comparative approach. We found that overall, TE cis-regulatory activity is greatly diminished in differentiated mouse trophoblast when compared to their stem cell counterpart. On the other hand, evolutionarily young intracisternal A particle (IAP) elements are highly expressed in the placenta and create several alternative, placenta-specific transcriptional start sites for protein-coding genes. Placenta-expressed IAP elements are genetically polymorphic between mouse strains and drive species-specific expression of associated genes. These putative co-option events are therefore recent and may represent a prime example of how TE activity can drive fast placental evolution.

15
Knob K180 Constitutive Heterochromatin Of Maize Exhibit Tissue-Specific Chromatin Sensitive Profiles Distinct From Other Types Of Heterochromatins

Sattler, M. C.; Singh, A.; Bass, H. W.; Mondin, M.

2026-04-04 genetics 10.64898/2026.04.01.715864 medRxiv
Top 0.1%
2.6%

Background: Maize knobs are regions of constitutive heterochromatin that are readily identified in both meiotic and somatic chromosomes. These structures have been characterized as stable throughout the cell cycle, exhibiting late replication during the S-phase, and are composed of two specific families of highly repetitive DNA sequences: K180 and TR-1. Although widely used as cytogenetic markers due to their variability in number and chromosomal position across inbred lines, hybrids, and landraces, little is known about their chromatin structure and dynamics. In this study, we analyzed chromatin accessibility of knobs using DNS-seq data across four maize tissues representing distinct developmental stages. Results: Our results reveal that K180 knobs exhibit tissue-specific variation in chromatin accessibility, transitioning between open and closed states during development. In contrast, the TR-1 knob of chromosome 4 remained consistently inaccessible across all tissues analyzed. A knob composed of both K180 and TR-1 further supported this observation, with only the K180 region showing dynamic accessibility. To validate these findings, we also analyzed other repetitive regions such as centromeres, which showed a uniformly closed chromatin structure similar to TR-1. These results suggest a unique developmental modulation of chromatin accessibility associated with K180 repeats. While the chromatin accessibility of knobs does not reach the levels observed at Transcription Start Sites (TSS), the comparison among different classes of repetitive DNA within maize constitutive heterochromatin provides compelling evidence for sequence-specific and tissue-specific chromatin dynamics. Conclusions: Our findings uncover a previously unrecognized property of maize knobs and establish a reference for future studies on chromatin organization and epigenetic regulation of repetitive DNA in plant genomes.

16
TEsingle enables locus-specific transposable element expression analysis at single-cell resolution

Forcier, T.; Cheng, E.; Tam, O. H.; Wunderlich, C.; Castilla-Vallmanya, L.; Jones, J. L.; Quaegebeur, A.; Barker, R. A.; Jakobsson, J.; Gale Hammell, M.

2026-03-22 genomics 10.64898/2026.03.19.712984 medRxiv
Top 0.1%
2.3%

Transposable elements (TEs) are mobile genetic sequences that can generate new copies of themselves via insertional mutations. These viral-like sequences comprise nearly half the human genome and are present in most genome-wide sequencing assays. While only a small fraction of genomic TEs have retained their ability to transpose, TE sequences are often transcribed from their own promoters or as part of larger gene transcripts. Accurately assessing TE expression from each individual genomic TE locus remains an open problem in the field, due to the highly repetitive nature of these multi-copy sequences. These issues are compounded in single-cell and single-nucleus transcriptome experiments, where additional complications arise due to sparse read coverage and unprocessed mRNA introns. Here we present our tool for single-cell TE and gene expression analysis, TEsingle. Using synthetic datasets, we show the problems that arise when not properly accounting for intron retention events, failing to address uncertainty in alignment scoring, and failing to make use of unique molecular identifiers for transcript resolution. Addressing these challenges has enabled an accurate TE analysis suite that simultaneously tracks gene expression as well as locus-specific resolution of expressed TEs. We showcase the performance of TEsingle using single-nucleus profiles from substantia nigra (SN) tissues of Parkinson's disease (PD) patients. We find examples of young and intact TEs that mark dopaminergic (DA) neurons as well as many young TEs from the LINE and ERV families that are elevated in PD neurons and glia. These results demonstrate that TE expression is highly cell-type and cellular-state specific and elevated in particular subsets of neurons, astrocytes, and microglia from PD patients.

17
Molecular and functional characterization of telomeric repeat-containing RNAs in Chinese hamster ovary cells

Domingues-Silva, B.; Azzalin, C. M.

2026-04-02 cell biology 10.64898/2026.04.01.715793 medRxiv
Top 0.1%
2.1%

Mammalian telomeric DNA comprises long tracts of tandem TTAGGG repeats. The same repeats are also found at internal chromosomal regions called interstitial telomeric sequences (ITSs). Telomeres are transcribed into UUAGGG-containing transcripts, named TERRA, which serve multiple functions in maintaining telomere integrity. Complementary RNAs containing C-rich telomeric repeats, named ARIA, have also been identified in a few yeast mutants and in mammalian cells with dysfunctional telomeres. The molecular features and functions of ARIA remain understudied, mainly due to its low abundance and the lack of suitable cellular systems. Here, we show that Chinese hamster ovary (CHO) cells produce abundant TERRA and ARIA transcripts, predominantly originating from ITSs. Both RNAs are polyadenylated, exhibit relatively short half-lives and form large cellular foci. We also show that ARIA depletion leads to exposure of single-stranded (ss) DNA at ITSs and that ssDNA exposure increases when ITS DNA is damaged. Formation of ssDNA does not require the DNA damage signaling kinases ATM and ATR, nor the exonucleases DNA2 and EXO1; however, ATM prevents excessive ssDNA accumulation when ARIA function is inhibited. These findings establish CHO cells as a powerful model to dissect telomeric RNA functions and reveal ARIA as a key regulator of telomeric repeat DNA integrity.

18
Repetitive DNA shapes genome architecture and chromosomal diversification in birds of prey

Souza, G. M.; Vidal, J. A. D.; Toma, G. A.; Kretschmer, R.; de Oliveira, E. H. C.; Liehr, T.; Cioffi, M. d. B.

2026-03-05 genetics 10.64898/2026.03.03.709396 medRxiv
Top 0.1%
2.0%

The evolution of genome architecture occurs through dynamic interactions between repetitive DNAs and chromosomal organization; nevertheless, the processes underlying these interactions are not well understood. This study presents a comprehensive genomic and cytogenetic analysis of repetitive DNA evolution across Accipitridae birds, a raptor family notable for its significant chromosomal variation. We aimed to investigate how repetitive DNAs have evolved across Accipitriform lineages and test whether shifts in repeat composition are associated with patterns of species diversification. Comparative investigations of eight genomes reveal lineage-specific spikes of transposable elements and satellite DNAs that substantially modify genome composition while preserving a common structural framework. Temporal insertion profiles indicate that repeat turnover is ongoing and frequently coincides with lineages exhibiting extensive chromosomal reorganization. By integrating comparative repeatome analyses with in silico and cytogenetic mapping, we elucidate the spatial architecture governing repeat dynamics, connecting molecular turnover to chromosomal structure. These findings underscore the effectiveness of merging genomic and chromosomal data to elucidate the impact of repeat landscapes on chromosomal and genomic evolution.

19
Distinct modes of sequence evolution and epigenetic modifications underpin the origins of Starship-mediated variation in Pyricularia fungal plant pathogens

O'Donnell, S.; McVey, A.; Valent, B.; Liu, S.; Gluck-Thaler, E.; Cook, D.

2026-01-29 genomics 10.64898/2026.01.28.702382 medRxiv
Top 0.1%
2.0%

Fungal pathogens display remarkable variation in genome content and organization that directly impacts their survival and host interactions. Although numerous models have been proposed to explain the origins of this variation, they generally fail to explain or predict the mechanisms that generate the genome variation observed in natural populations. Starships are a recently discovered group of giant fungal transposons that carry dozens of genes as cargo and horizontally transfer both within and between species. Here, we identify the features of a newly defined "Starship compartment" in the major fungal plant pathogen Pyricularia oryzae. We test the hypothesis that the Starship compartment makes distinct contributions to fungal genome evolution by explicitly comparing its transferability, mutability, and epigenetic modifications with those of the canonical core and accessory compartments. To enable this, we developed an updated and user-friendly version of the annotation tool stargraph for the comprehensive annotation of Starships and Starship-like regions. Using this approach, we identified two distinct families of Starships and related Starship-like regions in P. oryzae that differ in their activity, impacts on genome organization, modes of sequence evolution, and epigenetic modifications. Elements from the more active family exhibit higher rates of structural variation than all other genomic compartments in the predominantly clonal isolates infecting rice. Both families of Starships encode specific suites of known effector sequences that contribute to plant disease, and Starship activity accounts for avirulence gene turnover, which suggests that evolutionary change within the Starship compartment may subsequently impact the evolution of plant-fungal interactions. Starships from the more active family have repeatedly transferred across the Pyricularia genus and tend to be depleted in heterochromatic histone modifications and repeat-induced point mutations. However, contrasting histone modification profiles in this family suggest a genomic conflict between silencing and maintaining Starship activity. Our findings demonstrate that variation in the mode of sequence diversification and epigenetic modification within the Starship compartment underpins the impacts of these giant transposons on fungal genome evolution. We argue for the explicit consideration of not only the Starship compartment but also element-specific dynamics when investigating the evolution of host-fungal interactions.

20
Taming the Genetic Fire: Transposable Element diversity across thermal environments in polychaetes

Lamothe, L.; Hourdez, S.; Robert, T.; Bonnivard, E.

2026-02-26 evolutionary biology 10.64898/2026.02.25.703748 medRxiv
Top 0.1%
1.7%

Genetic variation plays a central role in enabling organisms to adapt to ever-changing environments. Transposable elements (TEs) are key drivers of genetic variation and adaptation, partly due to their ability to respond to environmental changes, such as thermal variability, through transcriptional activation, potentially leading to insertion events. The new copies will eventually accumulate mutations, increasing TE diversity in the genome. In this study, we investigated how TE diversity varies across environments that differ in their average temperature and thermal variability profile, using polychaete annelids as a model system. These primarily benthic organisms occupy a wide range of habitats, from polar waters to hydrothermal vents and temperate shores. TE diversity varied substantially among polychaete species, with significantly lower diversity observed in species inhabiting unstable habitats, such as those associated with hydrothermal vents. This link between TE diversity and environment was statistically consistent across the different TE orders, except for DIRS-like elements in Errantia polychaetes, which display a surprisingly high diversity. Our results suggest that TE diversity may be selected to balance the level of TE activation, linked to thermal variability, to maintain a sustainable mutation rate at the whole-genome level. In unstable environments, high TE diversity may not be sustainable due to the accumulation of deleterious mutations, caused by a higher rate of stress-induced transposition compared to other habitats. These findings highlight the influence of environmental conditions on the long-term dynamics governing TE-host interactions and underscore the role of TEs in evolution.